ON THE CONVERGENCE OF LACUNARY WALSH-FOURIER SERIES 



YEN DO AND MICHAEL LACEY 

Abstract. We show that for a function $f$ on $[0,1]$ with $\int |f| (\log\log_+ |f|)(\log\log\log_+ |f|)\,dx < \infty$, and a lacunary subsequence of integers $\{n_j\}$, it holds that $S_{n_j} f \to f$ a.e., where $S_m f$ is the $m$-th Walsh-Fourier partial sum of $f$. According to a result of Konyagin, the sharp integrability condition would not have the triple-log term in it. The method of proof uses four ingredients: (1) analysis on the Walsh Phase Plane, (2) the new multi-frequency Calderón-Zygmund Decomposition of Nazarov-Oberlin-Thiele, (3) a classical inequality of Zygmund, giving an improvement in the Hausdorff-Young inequality for lacunary subsequences of integers, and (4) the extrapolation method of Carro-Martín, which generalizes the work of Antonov and Arias-de-Reyna.



1. Introduction 

Let $f$ be an integrable function on the torus $\mathbb{T}$, which will be associated with the interval $[0,1]$ in this paper. We consider the Walsh system of functions on $[0,1]$ given by $W_0(x) = 1$, and for $n \ge 1$, we write $n = \sum_{k=0}^{r} \epsilon_k 2^k$ in binary digits, and define

$$W_n(x) := \prod_{k=0}^{r} \bigl(\operatorname{sign} \sin(2^{k+1} \pi x)\bigr)^{\epsilon_k}.$$

(We are reserving a lower-case $w$ for Walsh wave packets defined in §2.) The Walsh system is a complete orthonormal system for $L^2(0,1)$, and each $f \in L^2$ has the Walsh-Fourier representation

$$f = \sum_{k \ge 0} \hat f(k) W_k, \qquad \hat f(k) := \int_{[0,1]} f(x) W_k(x)\,dx.$$

The partial sums of the series above, $S_n f := \sum_{k=0}^{n} \hat f(k) W_k$, are the concern of this paper, strongly motivated by the close analogy between Walsh-Fourier series and Fourier series.
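As a numerical sanity check of the definitions above, the following sketch evaluates $W_n$ on a dyadic grid and verifies orthonormality and exact reconstruction for a grid-constant function. The names `walsh` and `partial_sum`, the grid size, and the test function are illustrative assumptions, not taken from the paper; integrals are replaced by grid averages, which is exact only for functions constant on the grid cells.

```python
import numpy as np

def walsh(n, x):
    """Paley-ordered Walsh function W_n at points x in [0, 1)."""
    x = np.asarray(x, dtype=float)
    out = np.ones_like(x)
    k = 0
    while n >> k:
        if (n >> k) & 1:
            # sign sin(2^{k+1} pi x) equals (-1)^(the (k+1)-th binary digit of x)
            digit = np.floor(x * 2 ** (k + 1)).astype(np.int64) % 2
            out *= 1.0 - 2.0 * digit
        k += 1
    return out

def partial_sum(f_vals, n, x):
    """S_n f = sum_{k=0}^{n} <f, W_k> W_k, with integrals taken as grid averages."""
    s = np.zeros_like(x)
    for k in range(n + 1):
        wk = walsh(k, x)
        s += np.mean(f_vals * wk) * wk
    return s

m = 6
x = (np.arange(2 ** m) + 0.5) / 2 ** m      # midpoints of dyadic cells of length 2^-m
gram = np.array([[np.mean(walsh(a, x) * walsh(b, x)) for b in range(16)]
                 for a in range(16)])        # Gram matrix of W_0, ..., W_15
f_vals = (x < 0.3).astype(float)             # a function constant on the grid cells
err = np.max(np.abs(partial_sum(f_vals, 2 ** m - 1, x) - f_vals))
```

Here `gram` should be the identity matrix, and `err` should vanish up to rounding, since $W_0, \dots, W_{2^m - 1}$ span the functions that are constant on dyadic cells of length $2^{-m}$.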

The question addressed here, and brought to our attention by the informative article of Konyagin [12], is this: If a given sequence of integers $n_j$ is sparse enough, can one assert the pointwise convergence of the Walsh-Fourier sums $S_{n_j} f$ for a broader class of functions than one has for the full sequence of partial sums? Indeed, this is the case, as we will show in Theorem 1.4 below; our result comes close to resolving Konyagin's Conjecture 1.2 below.

Let us recall the essential facts. It is well known that integrability of $f$ is not enough for this pointwise convergence, by a counterexample of Kolmogorov [9], who in fact constructed an integrable function whose Fourier series diverges almost everywhere. Carleson [3] showed in his seminal paper the almost everywhere convergence of the Fourier series for $f \in L^2$, and Hunt [8] observed an extension of Carleson's proof to $L^p$, $1 < p < \infty$. These results were reproved by Fefferman [6], and the approach of Lacey-Thiele [14] is modified in this paper.

A result of Konyagin [11], extending the work of many, including [7, 13, 22], shows the following. 

1.1. Theorem. Let $\phi$ be an increasing convex function such that $\phi(t) = o(t \log\log t)$ as $t \to \infty$. Then, for any increasing sequence of integers $n_j$ there is an $f \in \phi(L)$ such that $\sup_j |S_{n_j} f(x)| = \infty$ for all $x \in [0,1]$.

Konyagin [12] has conjectured that the previous result is in fact sharp for lacunary subsequences of integers. A sequence $(n_j)_{j \ge 1}$ is called lacunary if

$$\inf_j \frac{n_{j+1}}{n_j} > 1.$$

Below, $\log_+ x = 27 + \max(0, \log x)$.^1

1.2. Conjecture. Let $\{n_j : j \ge 1\}$ be a lacunary sequence of integers. If

$$\int_{\mathbb{T}} |f(x)| \log\log_+ |f(x)|\,dx < \infty,$$

then $S_{n_j} f(x) \to f(x)$ for almost every $x \in \mathbb{T}$.

The following result about arbitrary lacunary partial sums is formulated in [12] and is attributed 
to Zygmund [23]. It provides an integrability condition sufficient for convergence of Fourier series 
better than what is known for the full sequence of integers, see §7. 

1.3. Theorem. Let $\{n_j : j \ge 1\}$ be a lacunary sequence of integers. If

$$\int_{\mathbb{T}} |f(x)| \log_+ |f(x)|\,dx < \infty,$$

then $S_{n_j} f(x)$ converges to $f(x)$ for almost every $x \in \mathbb{T}$.



^1 With this definition, only one $\log_+$ is needed in the formulas.

Our main results are the following, which do not quite resolve Konyagin's Conjecture, but do
certainly indicate that for lacunary sequences, one can have convergence for a much broader class 
of functions than that of the full sequence of partial sums. 

1.4. Theorem. Let $\{n_j : j \ge 1\}$ be a lacunary sequence of integers. If

$$\int_{\mathbb{T}} |f| (\log\log_+ |f|)(\log\log\log_+ |f|)\,dx < \infty$$

then the Walsh-Fourier partial sums $S_{n_j} f(x) \to f(x)$ for a.e. $x \in \mathbb{T}$. Furthermore,

(1.5)  $\bigl\| \sup_j |S_{n_j} f| \bigr\|_{1,\infty} \lesssim \|f\|_{L(\log\log_+ L)(\log\log\log_+ L)}$,

(1.6)  $\bigl\| \sup_j |S_{n_j} f| \bigr\|_{1} \lesssim \|f\|_{L(\log L)(\log\log_+ L)[0,1]}$.

Note that in (1.5) and (1.6) we use the Luxemburg norms on the right hand sides. We'll recall some standard facts about these norms and Orlicz spaces at the end of this section.

Our proof will employ the time-frequency analysis techniques introduced in Lacey-Thiele [14], 
recalled in §2, with an additional key ingredient, namely the multi-frequency Calderon-Zygmund 
decomposition introduced in Nazarov-Oberlin-Thiele [16]. In a Calderon-Zygmund decomposition, 
one decomposes a function into two parts, where the good part is bounded and the bad part is 
localized to a family of intervals where it has cancellation properties. The classical Calderon- 
Zygmund decomposition requires the bad part have mean zero on each interval, and in a multi- 
frequency decomposition one requires certain modulations of the bad part to jointly have mean 
zero. In [17], Oberlin and Thiele used a Walsh-Paley variant of this decomposition to extend 
boundedness results for a Walsh variant of the Bilinear Hilbert transform. In this sense, our proof 
is a continuation of this theme. In our setting, we are able to obtain improvement in Carleson's 
Theorem when the collection of frequencies is lacunary. Essentially, the estimate of the good part 
in the multi-frequency decomposition in [16] is based on the Hausdorff-Young inequality. It turns out that if the sequence $n_1 < n_2 < \cdots$ is lacunary, one can obtain an improvement of the Hausdorff-Young estimate, and this is the key to the improvement in Theorem 1.4. The improvement of the Hausdorff-Young estimate is due to Zygmund [23], see Proposition 3.3 below.

Using the above ingredients, we will be able to show the following refined distributional estimate

$$(\mathcal{C} f)^*(t) \lesssim \frac{|F|}{t} \log\log_+\Bigl(\frac{t}{|F|}\Bigr), \qquad t > 0,$$

for any $f$ majorized by $F \subset [0,1]$, see Lemma 3.2. From this estimate, the strong type estimate (1.6) follows easily. To obtain (1.5), we'll use the extrapolation technique of Antonov [1], which has been extended and generalized in Arias-de-Reyna [2] and Carro-Martín [4]. Details about the proof of Theorem 1.4 are presented in §4 and §6, and more remarks about this intricate subject are included in §7.

We recall standard facts about Orlicz spaces and Luxemburg norms.

1.7. Definition. A function $\psi$ is an Orlicz function if it is a convex non-decreasing function on $[0, \infty)$ such that

$$\psi(0) = 0, \qquad \lim_{t \to \infty} \psi(t) = \infty.$$

For a probability space $(\Omega, P)$, we set $\psi(L)(\Omega)$ to be those functions $f$ such that for some $C > 0$ we have $E\psi(|f|/C) < \infty$, and then define the corresponding Luxemburg norm of $f$ by

$$\|f\|_{\psi(L)} := \inf\{C : E\psi(|f|/C) \le 1\}.$$

If we write $\psi(L)(I)$ for an interval $I$, we mean that the probability space is $I$ with normalized Lebesgue measure. And by $\psi(L)$, we mean that the interval is $[0,1]$.
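To make Definition 1.7 concrete, here is a minimal sketch that computes a Luxemburg norm by bisection for a simple function on a finite probability space. The function name `luxemburg_norm`, the tolerance, and the particular Orlicz function `psi` (a stand-in of $L(\log L)^{1/2}$ type, not the exact normalization used in the paper) are assumptions made for illustration.

```python
import math

def luxemburg_norm(values, weights, psi, tol=1e-12):
    """inf{C > 0 : E psi(|f|/C) <= 1} for f taking values[i] with probability weights[i]."""
    def expect(c):
        return sum(w * psi(abs(v) / c) for v, w in zip(values, weights))
    if all(v == 0 for v in values):
        return 0.0
    hi = 1.0
    while expect(hi) > 1.0:        # enlarge until the constraint holds
        hi *= 2.0
    lo = hi
    while expect(lo) <= 1.0:       # shrink until the constraint fails
        lo /= 2.0
    while hi - lo > tol * hi:      # expect(c) is non-increasing in c, so bisect
        mid = 0.5 * (lo + hi)
        if expect(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi

# psi(t) = t^2 recovers the L^2 norm: an indicator of a set of measure 1/4 has norm 1/2.
l2_norm = luxemburg_norm([1.0, 0.0], [0.25, 0.75], lambda t: t * t)

# Hypothetical Orlicz function of L (log L)^{1/2} type, applied to an indicator of measure delta.
psi = lambda t: t * math.sqrt(math.log(math.e + t))
delta = 1e-3
c = luxemburg_norm([1.0, 0.0], [delta, 1.0 - delta], psi)
```

For an indicator of a set of measure `delta` the norm solves `delta * psi(1 / c) == 1`, which is the mechanism behind bounds of the form $\delta (\log 1/\delta)^{1/2}$ used later in §3.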

2. Tiles and Time-Frequency Algorithm 

We formulate the details of the Walsh Phase Plane; the linearization of the Carleson operator that is used in the rest of the paper; and some key details of the proof of Carleson's Theorem in [14]. The Walsh Phase Plane is the closed quadrant $\mathbb{R}_+ \times \mathbb{R}_+$ of the plane. A dyadic rectangle is of the form

(2.1)  $p = I \times \omega = [m 2^j, (m+1) 2^j) \times [n 2^k, (n+1) 2^k)$

for integers $m, n, j, k$. A tile is a dyadic rectangle of area 1, and a bi-tile is a dyadic rectangle of area 2. For a tile $p = I \times \omega$, we will refer to $I$ as the time interval associated to $p$, and $\omega$ as the frequency interval. A bi-tile $P$ can be split into an upper half $P_u$ and a lower half $P_\ell$. Associated to a tile $p$ is a Walsh wave packet $w_p$ which, in the notation of (2.1), is

$$w_p(x) = w_{I \times \omega}(x) := 2^{-j/2}\, W_n\bigl(2^{-j}(x - m 2^j)\bigr).$$

It follows that $w_p$ has $L^2$ norm one; is supported on $I$; and is orthogonal to any $w_{p'}$, where $p'$ is a second tile that does not intersect $p$.

The variant of the Carleson operator we consider is defined as follows. For a measurable function $N : \mathbb{R}_+ \to \mathbb{R}_+$, we set

(2.2)  $\mathcal{C} f(x) := \sum_{P} \langle f, w_{P_\ell} \rangle\, w_{P_\ell}(x)\, 1_{\{(x, N(x)) \in P_u\}}$

The sum is over all bi-tiles $P \subset \mathbb{R}_+ \times \mathbb{R}_+$, and we do not indicate the dependence of this definition on the choice of function $N$.
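The stated properties of the wave packets $w_p$ (unit $L^2$ norm, orthogonality for disjoint tiles) can be checked numerically for tiles inside $[0,1)$. In this sketch a tile is encoded by a hypothetical triple `(j, m, n)` meaning $[m2^j,(m+1)2^j) \times [n2^{-j},(n+1)2^{-j})$; the helper names and the grid are illustrative assumptions, and the packet is set to zero outside its time interval.

```python
import numpy as np

def walsh(n, y):
    """Paley-ordered Walsh function W_n, evaluated through binary digits of y."""
    out = np.ones_like(y)
    k = 0
    while n >> k:
        if (n >> k) & 1:
            out *= 1.0 - 2.0 * (np.floor(y * 2 ** (k + 1)).astype(np.int64) % 2)
        k += 1
    return out

def wave_packet(j, m, n, x):
    """w_p for the tile p = [m 2^j, (m+1) 2^j) x [n 2^-j, (n+1) 2^-j)."""
    y = 2.0 ** (-j) * (x - m * 2.0 ** j)          # rescale the time interval to [0, 1)
    inside = (y >= 0) & (y < 1)
    return np.where(inside, 2.0 ** (-j / 2) * walsh(n, y), 0.0)

def inner(u, v):
    """L^2[0, 1) inner product of functions constant on the grid cells."""
    return float(np.mean(u * v))

x = (np.arange(256) + 0.5) / 256
w1 = wave_packet(-1, 0, 2, x)   # I = [0, 1/2),   omega = [4, 6)
w2 = wave_packet(-2, 1, 0, x)   # I = [1/4, 1/2), omega = [0, 4): disjoint from the first tile
w3 = wave_packet(-2, 1, 1, x)   # I = [1/4, 1/2), omega = [4, 8): intersects the first tile
```

With these choices `inner(w1, w2)` vanishes, while `inner(w1, w3)` does not, in line with the orthogonality statement holding only for non-intersecting tiles.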

And, the Carleson Theorem for the Walsh Phase Plane is

2.3. Carleson Theorem. For $f \in L^2(\mathbb{R}_+)$ of norm one, and $G \subset \mathbb{R}_+$ of Lebesgue measure one, we have the estimate on the bilinear form $\langle \mathcal{C} f, g \rangle \lesssim 1$, where $0 \le |g| \le 1_G$, and the implied constant is absolute.



2.4. Remark. The Carleson operator (2.2) is the discretization of the maximal operator

$$\sup_{n > 0} |S_n f(x)|$$

(cf. [20]). In our setting, the supremum is taken over a lacunary subsequence, so the following additional restriction will be in place in subsequent sections: With a lacunary sequence $\{n_j : j \ge 1\}$ fixed, we can assume that the function $N(x)$ in (2.2) is defined on (a subset of) $[0,1]$ and has range restricted to $\{n_j : j \ge 1\}$. Thus, we only consider bi-tiles $P$ so that the upper half of the frequency interval of $P$ contains at least one $n_j$. Furthermore, we can and will assume that for every bi-tile $P$ in the Carleson operator, the time interval $I_P$ is contained in $[0,1]$. In particular, this means $\mathcal{C} f$ is supported inside $[0,1]$.

We recall the key elements of the proof of Theorem 2.3, following the lines of analysis of [14].

The set of bi-tiles admits a partial order, which we write as $I \times \omega \le I' \times \omega'$ if and only if $I \subset I'$ and $\omega' \subset \omega$. It follows that two bi-tiles $P$, $P'$ are related by this order if and only if they intersect in the Phase Plane. We then define

(2.5)  $\operatorname{dense}(P) := \sup_{P' = I' \times \omega' \,:\, P \le P'} \dfrac{|\{x \in I' \cap G : (x, N(x)) \in P'\}|}{|I'|}$.

If $\mathbf{P}$ is any collection of tiles, we set $\operatorname{dense}(\mathbf{P}) := \sup_{P \in \mathbf{P}} \operatorname{dense}(P)$.

A tree is a collection $T$ of bi-tiles such that there is a (non-unique) bi-tile $I_T \times \omega_T$ such that $P \le I_T \times \omega_T$ for all $P \in T$. We define

$$\operatorname{size}_f(\mathbf{P}) := \sup_{T} \Bigl( |I_T|^{-1} \sum_{\substack{P \in T \\ P_\ell \cap I_T \times \omega_T = \emptyset}} |\langle f, w_{P_\ell} \rangle|^2 \Bigr)^{1/2},$$

where the supremum is formed over all trees $T \subset \mathbf{P}$. It is essential to note that the sum is restricted to those tiles $P \in T$ for which the lower half does not intersect the top of the tree. We add the subscript $f$ because, in the application of these concepts, we will be changing $f$.

We will write $\operatorname{energy}(\mathbf{P}) \le A$ if the collection of bi-tiles $\mathbf{P}$ is the union of trees $T \in \mathcal{T}$, such that

$$\sum_{T \in \mathcal{T}} |I_T| \le A.$$

These next Lemmas give a quick proof of Carleson's Theorem, and we will have recourse to 
them, and their consequences in this paper. 

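2.6. Density Lemma. Any collection of tiles $\mathbf{P}$ can be written as $\mathbf{P}_{small} \cup \mathbf{P}_{big}$ where these conditions hold.

(1) $\operatorname{dense}(\mathbf{P}_{small}) \le \tfrac12 \operatorname{dense}(\mathbf{P})$;

(2) $\operatorname{energy}(\mathbf{P}_{big}) \lesssim \operatorname{dense}(\mathbf{P})^{-1} |G|$.

(Recall the role of the set $G$ in Theorem 2.3 and (2.5).)

2.7. Size Lemma. Any collection of tiles $\mathbf{P}$ can be written as $\mathbf{P}_{small} \cup \mathbf{P}_{big}$ where these conditions hold.

(1) $\operatorname{size}_f(\mathbf{P}_{small}) \le \tfrac12 \operatorname{size}_f(\mathbf{P})$;

(2) $\operatorname{energy}(\mathbf{P}_{big}) \lesssim \operatorname{size}_f(\mathbf{P})^{-2} \|f\|_2^2$.

(Note the role of $f$ in this estimate.)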

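For collections of tiles $\mathbf{P}$ we will use the notation

(2.8)  $B_{\mathbf{P}}(f, g) := \sum_{P \in \mathbf{P}} \langle f, w_{P_\ell} \rangle \langle w_{P_\ell} 1_{\{(x, N(x)) \in P_u\}}, g \rangle$

2.9. Tree Lemma. For any tree $T$ we have the estimate

$$|B_T(f, g)| \lesssim \operatorname{dense}(T)\, \operatorname{size}_f(T)\, |I_T|.$$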

The next Lemma relates the concept of size to that of the Maximal Function of f. It is a 
consequence of the Calderon-Zygmund theory associated with trees. 

2.10. Lemma. Let $f \in L^1$, $\lambda > 0$, and let $\mathbf{P}$ be a collection of bi-tiles so that for all $I \times \omega \in \mathbf{P}$ we have $I \cap \{Mf \le \lambda\} \ne \emptyset$. We then have

(2.11)  $\operatorname{size}_f(\mathbf{P}) \lesssim \lambda$.

(In particular, size is bounded by the $L^\infty$ norm of $f$.)

To illustrate the Time Frequency Algorithm used in this paper, let us give a proof of Carleson's Theorem, conditional on the Lemmas above.

Proof of Carleson's Theorem. We can assume that $f \in L^2 \cap L^\infty$. Then, by (2.11), it follows that we have an upper bound on the size of the collection of all tiles $\mathbf{P}_{all}$. Hence, we have both the Size and Density Lemmas available. Appropriate inductive application of them leads to a decomposition of $\mathbf{P}_{all}$ into collections $\mathbf{P}_n$, for $n \in \mathbb{Z}$, such that

(1) $\operatorname{dense}(\mathbf{P}_n) \le \min\{1, 2^{-n}\}$;

(2) $\operatorname{size}_f(\mathbf{P}_n) \le 2^{-n/2}$;

(3) $\operatorname{energy}(\mathbf{P}_n) \lesssim 2^{n}$;

(Note that the density is never more than one, and that the energy estimate matches the conclusions of the Size and Density Lemmas.) In particular, $\mathbf{P}_n$ is the union of trees $T \in \mathcal{T}_n$ such that we can estimate

$$\sum_{P \in \mathbf{P}_n} \bigl| \langle f, w_{P_\ell} \rangle \langle w_{P_\ell} 1_{\{(x, N(x)) \in P_u\}}, g \rangle \bigr| \lesssim \min\{1, 2^{-n}\}\, 2^{-n/2} \sum_{T \in \mathcal{T}_n} |I_T| \lesssim \min\{1, 2^{-n}\}\, 2^{n/2} = \min\{2^{n/2}, 2^{-n/2}\}, \qquad n \in \mathbb{Z}.$$

The latter estimate is summable over $n \in \mathbb{Z}$, so the proof is complete. □
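The summability used in the last step can be made explicit. The following short computation is an added illustration and is not part of the original argument:

```latex
\sum_{n \in \mathbb{Z}} \min\{2^{n/2}, 2^{-n/2}\}
  = 1 + 2 \sum_{n \ge 1} 2^{-n/2}
  = 1 + \frac{2}{\sqrt{2} - 1}
  = 3 + 2\sqrt{2}.
```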

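In subsequent sections, the following situation will appear. Suppose that a collection $\mathbf{P}$ of tiles satisfies

(2.12)  $\operatorname{energy}(\mathbf{P}) \lesssim \operatorname{dense}(\mathbf{P})^{-1} |G|$.

It follows that one has the estimate

$$|B_{\mathbf{P}}(f, g)| \lesssim \operatorname{dense}(\mathbf{P})\, \operatorname{size}_f(\mathbf{P})\, \operatorname{energy}(\mathbf{P}) \lesssim |G|\, \operatorname{size}_f(\mathbf{P}).$$

We can do better than this estimate if it is more effective to apply the Size Lemma. This leads to the following Lemma, also see [15].

2.13. Lemma. Assume that $\mathbf{P}$ satisfies (2.12), and let $f \in L^2$. We have the following estimate

(2.14)  $|B_{\mathbf{P}}(f, g)| \lesssim \min\bigl\{ \operatorname{size}_f(\mathbf{P})\, |G| ,\ \operatorname{dense}(\mathbf{P})^{1/2} \sqrt{|G|}\, \|f\|_2 \bigr\}$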

Proof. The first estimate follows immediately from the assumptions and the Tree Lemma. So, we assume that the second term on the right in (2.14) is the smaller of the two. That is, we assume that we have the inequality

$$\operatorname{size}_f(\mathbf{P})^{-2} \|f\|_2^2 \lesssim \delta^{-1} |G|,$$

where we set $\operatorname{dense}(\mathbf{P}) = \delta$. The left hand side of the last display is exactly the estimate on Energy that we would get by application of the Size Lemma. Hence, it is more efficient to apply the Size Lemma until the Energy estimate it provides matches the right hand side of (2.12).

To be precise, set $n_0$ to be the integer part of $-\log_2[\delta \|f\|_2^2 |G|^{-1}]$, and write $\mathbf{P}$ as the union of collections $\mathbf{P}_n$, for $n \le n_0$, where the collections $\mathbf{P}_n$ satisfy

(1) $\operatorname{dense}(\mathbf{P}_n) \le \delta$;

(2) $\operatorname{size}(\mathbf{P}_n) \le 2^{-n/2}$;

(3) $\operatorname{energy}(\mathbf{P}_n) \lesssim 2^{n} \|f\|_2^2$.

This decomposition is obtained by solely applying the Size Lemma, until the last step when $n = n_0$, when the conclusions will follow from the assumption (2.12). We then have

$$|B_{\mathbf{P}_n}(f, g)| \lesssim \delta\, 2^{n/2} \|f\|_2^2,$$

which is a geometric series that sums to at most a constant times its largest term, attained at $n = n_0$, yielding our Lemma. □
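For the reader's convenience, the geometric sum at the end of the proof can be written out. This is an added worked step, under the assumption that the choice of $n_0$ gives $2^{n_0} \le |G| / (\delta \|f\|_2^2)$:

```latex
\sum_{n \le n_0} \delta\, 2^{n/2} \|f\|_2^2
  = \frac{\delta\, 2^{n_0/2}\, \|f\|_2^2}{1 - 2^{-1/2}}
  \le \frac{\delta\, \|f\|_2^2}{1 - 2^{-1/2}}
      \Bigl( \frac{|G|}{\delta\, \|f\|_2^2} \Bigr)^{1/2}
  = \frac{\delta^{1/2}\, |G|^{1/2}\, \|f\|_2}{1 - 2^{-1/2}} .
```

This matches the second term in (2.14) with $\delta = \operatorname{dense}(\mathbf{P})$.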

3. A Restricted Weak-Type Inequality

In this and the subsequent Sections, we shall fix a lacunary sequence $0 < n_1 < n_2 < \cdots < n_N$ of frequencies. All the implicit constants in the estimates will be independent of $N$, but depend on the lacunarity constant $\inf_j n_{j+1}/n_j > 1$.

Recall the definition of the Carleson operator C in (2.2). The observations of Remark 2.4 will 
be in force, and we will use the notation for the bilinear form Bp in (2.8). 

3.1. Definition. To say that $G'$ is a major subset of a set $G$ means that $G' \subset G$ and $|G'| \ge \tfrac12 |G|$.

3.2. Lemma. Let $F, G \subset [0,1]$. Then there is a major subset $G'$ of $G$ such that if $f$ is dominated by $F$ and $g$ is dominated by $G'$ then

$$|\langle \mathcal{C} f, g \rangle| \lesssim |F| \log\log_+(|G|/|F|).$$

We recall the following key inequality of Zygmund [23], which can be viewed as an improvement of the Hausdorff-Young inequality in the lacunary setting.

3.3. Proposition. Let $\{n_j : j \ge 1\}$ be a lacunary sequence of integers. We have the inequality

$$\bigl\| \{\hat f(n_j) : j \ge 1\} \bigr\|_{\ell^2} \lesssim \|f\|_{L(\log L)^{1/2}}.$$

We only indicate the proof here. The dual space of $L\sqrt{\log L}$ is the space $\exp(L^2)$, which is the Orlicz space associated with the Orlicz function $e^{x^2} - 1$. Moreover, there is a version of the Khintchine inequality which holds for the Walsh-Paley functions $\{W_{n_j} : j \ge 1\}$, which is phrased in terms of the $\exp(L^2)$ norm. Namely,

(3.4)  $\Bigl\| \sum_{j \ge 1} a_j W_{n_j} \Bigr\|_{\exp(L^2)} \lesssim \bigl\| \{a_j : j \ge 1\} \bigr\|_{\ell^2}$.

We can then prove the Zygmund inequality as follows. For $f \in L(\log L)^{1/2}$, let $\phi = \sum_{j \ge 1} \hat f(n_j) W_{n_j}$ be the projection of $f$ onto the lacunary frequencies, and observe that

$$\bigl\| \{\hat f(n_j) : j \ge 1\} \bigr\|_{\ell^2}^2 = \langle f, \phi \rangle \lesssim \|f\|_{L(\log L)^{1/2}} \|\phi\|_{\exp(L^2)} \lesssim \bigl\| \{\hat f(n_j) : j \ge 1\} \bigr\|_{\ell^2}\, \|f\|_{L(\log L)^{1/2}}.$$

And this completes the proof. The reader can compare this argument to [23, Ch. XII, 7.6].

Concerning (3.4), we are sure that this is known, but could not locate an explicit reference to it in the literature. One can modify the argument in [18] to show the equivalent form of (3.4)

$$\Bigl\| \sum_{j=1}^{\infty} a_j W_{n_j} \Bigr\|_p \le C \sqrt{p}\, \bigl\| \{a_j : j \ge 1\} \bigr\|_{\ell^2}, \qquad 1 < p < \infty.$$

Alternatively, one could show that the Haar Littlewood-Paley Square Function of $\sum_j a_j W_{n_j}$ has $L^\infty$-norm at most $C \|\{a_j : j \ge 1\}\|_{\ell^2}$, and then appeal to the Chang-Wilson-Wolff inequality, see [5].
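The $\sqrt{p}$ growth in the equivalent form of (3.4) can be observed numerically in the simplest lacunary case $n_j = 2^j$, where $W_{2^j}$ are the Rademacher functions and the statement is the classical Khintchine inequality. The sketch below is only an illustration of that special case; the coefficient choice, the number of terms, and the helper `lp_norm` are assumptions, and it says nothing about general lacunary sequences.

```python
import numpy as np

K = 10                                     # frequencies n_j = 2^j for j = 0, ..., K-1
x = (np.arange(2 ** K) + 0.5) / 2 ** K     # midpoints of dyadic cells of length 2^-K

# W_{2^j}(x) = sign sin(2^{j+1} pi x) = (-1)^(the (j+1)-th binary digit of x)
rademacher = np.array([1.0 - 2.0 * (np.floor(x * 2 ** (j + 1)).astype(np.int64) % 2)
                       for j in range(K)])

a = np.ones(K) / np.sqrt(K)                # coefficients with l^2 norm one
g = a @ rademacher                         # g = sum_j a_j W_{2^j}, constant on the grid cells

def lp_norm(values, p):
    """L^p[0, 1] norm of a function that is constant on the grid cells."""
    return float(np.mean(np.abs(values) ** p) ** (1.0 / p))

# Ratio ||g||_p / (sqrt(p) ||a||_2); it staying bounded reflects the sqrt(p) growth.
ratios = {p: lp_norm(g, p) / np.sqrt(p) for p in (2, 4, 8, 16)}
```

Because the grid enumerates every sign pattern of the $K$ Rademacher functions exactly once, the norms are computed exactly rather than by sampling.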

Proof of Lemma 3.2. Clearly if $|G| \lesssim |F|$ then the desired estimate follows from boundedness of the (lacunary) Carleson operator. We'll assume below that $|F| \le C_0 |G|$ for some absolute constant $C_0$.

We'll take $G' = G \setminus \{M 1_F > \lambda\}$ where we choose $\lambda \simeq |G|^{-1} |F|$ so that $G'$ is a major subset of $G$. Furthermore, we can choose $C_0$ small enough such that $\lambda < 1$, and it is then not hard to see that $f$ is supported inside $\{M 1_F > \lambda\}$. Below we show that this choice of $G'$ works. Let $\mathcal{I}$ be the maximal dyadic intervals $I \subset \{M 1_F > \lambda\}$. We then have $|F \cap I| \lesssim \lambda |I|$ for $I \in \mathcal{I}$, and

(3.5)  $\sum_{I \in \mathcal{I}} |I| \le |\{M 1_F > \lambda\}| \lesssim \lambda^{-1} |F|$.

Let $\mathbf{P}$ be those bi-tiles $P$ with $I_P \cap F \ne \emptyset$ and $I_P \cap G' \ne \emptyset$. It is clear that $\langle \mathcal{C} f, g \rangle = B_{\mathbf{P}}(f, g)$. We then decompose $\mathbf{P} = \bigcup_{k \ge 0} \mathbf{P}_k$, using only the Density Lemma, see Lemma 2.6. Thus, $\mathbf{P}_k$ is a union of trees $T$ in a collection $\mathcal{T}_k$, so that $\operatorname{dense}(\mathbf{P}_k) \le 2^{-k}$, and the estimate (2) of the Density Lemma holds, namely

$$\operatorname{energy}(\mathbf{P}_k) \le \sum_{T \in \mathcal{T}_k} |I_T| \lesssim 2^{k} |G|.$$

It follows from (2.11) and the first half of the estimate (2.14), that we have

$$|B_{\mathbf{P}_k}(f, g)| \lesssim \lambda |G| \lesssim |F|, \qquad k \ge 1.$$

We will use this estimate for $1 \le k \le k_0 = C \log\log_+ \frac{1}{\lambda}$, which is consistent with the estimate we want to prove.

We begin the multi-frequency part of the proof. Fix $k > k_0$. For convenience we suppress the dependence on $k$ in the following estimate of $B_{\mathbf{P}_k}(f, g)$, except for $\mathbf{P}_k$. For $I \in \mathcal{I}$, let $Q_I$ be those tiles $p$ with time interval $I$, which as rectangles in the phase plane intersect the lower half of some bi-tile $P \in \mathbf{P}_k$. Take any such pair $(p, P)$. By the construction of $\mathbf{P}_k$ and $\mathcal{I}$, it follows that we must have $p \le P_\ell$. The inequality between $p$ and the lower part of $P$ must be strict, hence we must have $p \le P_u$. Furthermore, using standard properties of Walsh packets, it follows that $w_{P_\ell}$ is a scalar multiple of $w_p$ on $I_p$.

We set $\phi_I$ to be the projection of $f$ onto the space spanned by the wave-packets $\{w_p : p \in Q_I\}$, and set $\phi = \sum_{I \in \mathcal{I}} \phi_I$. This implies that for any $P \in \mathbf{P}_k$ and any $I \in \mathcal{I}$ we have

$$\langle f 1_I - \phi_I, w_{P_\ell} \rangle = 0.$$

Indeed, since $\phi_I$ is supported on $I$, we can assume $I_P \cap I \ne \emptyset$. Then there will be an element $p \in Q_I$ such that $p \cap P_\ell \ne \emptyset$, and so we can replace $w_{P_\ell} 1_I$ by a multiple of $w_p$, and the desired equality follows.

Since $f$ is supported inside the union of intervals in $\mathcal{I}$, it then follows that we have $B_{\mathbf{P}_k}(f, g) = B_{\mathbf{P}_k}(\phi, g)$, and our objective is to use the Zygmund inequality to provide a favorable estimate for the $L^2$-norm of $\phi$.

We check that the Zygmund inequality applies to the tiles in $Q_I$. Let $a > 1$ be the lacunarity constant of the sequence $\{n_j : j \ge 1\}$. For a tile $p$ in this collection, write the frequency interval of $p$ as $[\mu_p, \mu_p + |I|^{-1})$, and let $n_{j(p)}$ be an element of the sequence in this interval (one exists, since $p \le P_u$ for a bi-tile $P$ whose upper frequency half contains some $n_j$). Taking a different $p' \in Q_I$, with $n_{j(p')} > n_{j(p)}$, we have

$$\frac{\mu_{p'}}{\mu_p} \ge \frac{n_{j(p')} - |I|^{-1}}{n_{j(p)}} \ge a - \frac{1}{|I|\, n_{j(p)}}.$$

Note that except for the first $1 + [\frac{2}{a-1}]$ tiles, we will have $n_{j(p)} \ge (1 + [\frac{2}{a-1}]) |I|^{-1} > \frac{2}{a-1} |I|^{-1}$. Hence, after at most $O_a(1)$ initial terms the sequence $\{\mu_p : p \in Q_I\}$ has lacunarity constant at least $(a+1)/2$, and so the sequence has a lacunarity constant that is only dependent on $a$.

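We estimate as below, where we will be using the Zygmund inequality, which requires an appropriate renormalization of the interval $I$.

$$\|\phi_I\|_2 = \bigl\| \{\langle f, w_p \rangle : p \in Q_I\} \bigr\|_{\ell^2} \lesssim \|f 1_I\|_{L(\log L)^{1/2}(I)}\, |I|^{1/2} \lesssim \lambda \Bigl( \log_+ \frac{1}{\lambda} \Bigr)^{1/2} |I|^{1/2},$$

where the last step uses that $f$ is dominated by $F$ and that $|F \cap I| \lesssim \lambda |I|$. It follows from (3.5) and $\lambda \simeq |F| \cdot |G|^{-1}$ that

$$\|\phi\|_2 \lesssim \lambda \Bigl( \log_+ \frac{1}{\lambda} \Bigr)^{1/2} |\{M 1_F > \lambda\}|^{1/2} \lesssim |F| \cdot |G|^{-1/2} \Bigl( \log_+ \frac{1}{\lambda} \Bigr)^{1/2}.$$

We now turn to the second half of the estimate (2.14) to see that

$$|B_{\mathbf{P}_k}(f, g)| = |B_{\mathbf{P}_k}(\phi, g)| \lesssim 2^{-k/2} \sqrt{|G|}\, \|\phi\|_2 \lesssim 2^{-k/2} |F| \Bigl( \log_+ \frac{1}{\lambda} \Bigr)^{1/2}.$$

By choice of $k_0 = C \log\log_+(|G|/|F|)$, we can sum this estimate in $k > k_0$ to conclude the proof of the Lemma. □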
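The role of the threshold $k_0$ can be spelled out as follows. This is an added worked step, treating $k_0$ as a real number (ignoring rounding), taking $\log$ to be the natural logarithm, and using $\lambda \simeq |F|/|G|$:

```latex
\sum_{k > k_0} 2^{-k/2} = (\sqrt{2} + 1)\, 2^{-k_0/2},
\qquad
2^{-k_0/2} = \Bigl( \log_+ \tfrac{1}{\lambda} \Bigr)^{-(C \ln 2)/2}
\quad \text{for } k_0 = C \log\log_+ \tfrac{1}{\lambda} .
% Hence, if C \ge 1/\ln 2, and since \log_+ \ge 27 > 1,
\sum_{k > k_0} 2^{-k/2}\, |F| \Bigl( \log_+ \tfrac{1}{\lambda} \Bigr)^{1/2}
  \le (\sqrt{2} + 1)\, |F| .
```

The terms with $1 \le k \le k_0$ contribute at most a constant times $k_0 |F| \simeq |F| \log\log_+(|G|/|F|)$, which is the bound claimed in Lemma 3.2.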

3.6. Remark. Some of the arguments in the proof above we have learned from the last two pages 
of [17], and specialized to the lacunary setting. 

4. Proof of the Strong-Type Estimate (1.6) 

We turn to the proof of the strong-type estimate that $\mathcal{C}$ maps $L \log L (\log\log L)$ into $L^1$. The intermediate inequality we prove is this: For any $F \subset [0,1]$ and function $f$ dominated by $F$ we have

(4.1)  $\|\mathcal{C} f\|_1 \lesssim |F| (\log_+ |F|^{-1})(\log\log_+ |F|^{-1})$.

The main idea of the proof is the following principle: one can pass from a restricted weak-type inequality, together with the $L^2$ estimate, to a strong-type inequality with a loss of a log term. Specifically, by Lemma 3.2, for any set $G \subset [0,1]$, there is a major subset $G' \subset G$ so that for any $g$ dominated by $G'$,

(4.2)  $|\langle \mathcal{C} f, g \rangle| \lesssim |F| (\log\log_+ |F|^{-1})$.

We apply (4.2) to the set $G_0 = [0,1]$, getting a major subset $G_0'$, and then recursively apply the inequality to $G_1 = G_0 \setminus G_0'$. After a number $t_0$ of steps, we will have $|G_{t_0}| \le |F|$, at which point we stop the recursion, and set $G_{t_0}' = G_{t_0}$. It is not hard to see that we can take

$$t_0 = 2 + [\log_2 |F|^{-1}].$$

Now, for appropriately chosen functions $g_t$ dominated by $G_t'$, for $0 \le t \le t_0$, we have

$$\|\mathcal{C} f\|_1 \le \sum_{t=0}^{t_0} \langle \mathcal{C} f, g_t \rangle \lesssim \sum_{t=0}^{t_0} |F| (\log\log_+ |F|^{-1}) \lesssim |F| (\log_+ |F|^{-1})(\log\log_+ |F|^{-1}).$$

We use (4.2) for $0 \le t < t_0$, and for the last term, we simply use the $L^2$ inequality for $\mathcal{C} f$.

To conclude the inequality (1.6) from the intermediate (4.1), one relies upon the fact that an arbitrary $\phi \in L \log L (\log\log L)$ is a convex combination of functions of the form

$$f \cdot \bigl[\, |F| \log_+ |F|^{-1} (\log\log_+ |F|^{-1}) \,\bigr]^{-1}$$

where $f$ is dominated by $F$. We omit the straightforward proof of this fact. The reader should be well aware that the same comments do not hold for the weak-type estimate, which is the focus of the next section.

5. A Sjolin-type distributional estimate 

Much of the remaining argument needed to conclude the weak-type estimate (1.5) is derived from observations brought to bear on the question of the convergence of the Fourier sums along the full sequence. Here, and in the remainder of the paper, we let $\mathcal{C}_{full} f = \sup_{n \ge 1} |S_n f|$. The estimate of Sjölin [19] is

$$(\mathcal{C}_{full} 1_F)^*(t) \lesssim \frac{|F|}{t} \log_+\Bigl(\frac{t}{|F|}\Bigr), \qquad F \subset [0,1],\ t > 0.$$

We denote the decreasing rearrangement $h^*$ of a function $h$ by

(5.1)  $h^*(t) = \inf\{s > 0 : |\{x : |h(x)| > s\}| \le t\}$.

Our purpose here is to establish the lacunary version of this inequality, and then use an observation of Antonov [1] to obtain a particular extension of this result.
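To make the definition (5.1) concrete, the following minimal sketch computes $h^*(t)$ for a step function on $[0,1)$ with equal cells. The function name and the sample data are illustrative assumptions; the convention $|\{|h| > s\}| \le t$ is the one written in (5.1) above.

```python
import math

def decreasing_rearrangement(values, t):
    """h*(t) for h taking values[i] on the i-th of N equal cells of [0, 1)."""
    n_cells = len(values)
    ordered = sorted((abs(v) for v in values), reverse=True)
    # At most floor(t * N) cells may lie strictly above the level s.
    k = math.floor(t * n_cells)
    return ordered[k] if k < n_cells else 0.0

h = [3.0, -1.0, 2.0, 0.0]
samples = [decreasing_rearrangement(h, t) for t in (0.0, 0.25, 0.3, 0.5, 1.0)]
```

In particular $h^*(t) = 0$ for $t \ge 1$ whenever $h$ is supported in $[0,1]$, which is the trivial case used in §6 for $t \ge 1$.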

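5.2. Lemma. For any $t > 0$ and any $f$ majorized by $F \subset [0,1]$, we have

$$(\mathcal{C}_{lac} f)^*(t) \lesssim \frac{|F|}{t} \log\log_+\Bigl(\frac{t}{|F|}\Bigr), \qquad F \subset [0,1],\ t > 0.$$

Here and below we set $\mathcal{C}_{lac} f = \sup_{j \ge 1} |S_{n_j} f|$.

Proof. For any $s > 0$, let $G_s = \{\mathcal{C} f(x) > s\}$. By Lemma 3.2, there exists a major set $G_s'$ such that for appropriate $g$ majorized by $G_s'$ we have

$$s |G_s| \le 2 s |G_s'| \le \langle \mathcal{C} f, g \rangle \lesssim |F| \log\log_+(|G_s|/|F|)$$

or equivalently $s \lesssim (|F|/|G_s|) \log\log_+(|G_s|/|F|)$. Therefore if $s > C \frac{|F|}{t} \log\log_+(\frac{t}{|F|})$ for some large absolute constant $C$ we'll have

$$\frac{|F|}{t} \log\log_+\Bigl(\frac{t}{|F|}\Bigr) < \frac{|F|}{|G_s|} \log\log_+\Bigl(\frac{|G_s|}{|F|}\Bigr).$$

So by the strictly increasing property of $s \log\log_+(1/s)$, we obtain

$$|F|/t < |F|/|G_s|,$$

or equivalently $|G_s| < t$. This completes the proof of the Lemma. □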
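The monotonicity of $s \mapsto s \log\log_+(1/s)$ invoked above, with the normalization $\log_+ x = 27 + \max(0, \log x)$ from §1, can be spot-checked numerically. This is only a sanity check on a finite grid, not a proof; the names `log_plus` and `phi` and the grid range are illustrative assumptions.

```python
import math

def log_plus(x):
    """log_+ x = 27 + max(0, log x), the normalization fixed in Section 1."""
    return 27.0 + max(0.0, math.log(x))

def phi(s):
    """s * log log_+(1/s), the function whose monotonicity is used in the proof."""
    return s * math.log(log_plus(1.0 / s))

# Logarithmically spaced grid from 1e-8 to 1e2.
grid = [10.0 ** (-8 + 10 * i / 2000) for i in range(2001)]
vals = [phi(s) for s in grid]
increasing = all(a < b for a, b in zip(vals, vals[1:]))

# The constant 27 also keeps the triple logarithm positive: log log 27 > 0.
triple_log_floor = math.log(math.log(log_plus(1.0)))
```

The second quantity illustrates why a constant of this size is convenient: $27 > e^e$, so $\log\log\log_+$ is well defined and positive everywhere.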

Now, we will use a key observation of Antonov [1] to remove the restricted-type assumption in Lemma 5.2. Unlike previous Lemmas where the bi-tiles in the definition of $\mathcal{C}$ could be arbitrary, in the following Lemma (and hence subsequent Lemmas) we need to know that $\mathcal{C}$ is the actual discretization of the lacunary maximal Carleson operator used in the Lemma above. The observation of Antonov is the following Lemma.

5.3. Lemma. For $M \in \mathbb{N}$, set $S_M^* f = \sup_{1 \le n \le M} |S_n f|$. For every $\epsilon > 0$, and function $0 \le f \le 1$ supported in $[0,1]$, there is a set $F \subset [0,1]$ with $\|f\|_1 = |F|$ and moreover $\|S_M^*(f - 1_F)\|_\infty \le \epsilon$.

A proof of Antonov's lemma in the Walsh-Fourier setting can be found in Sjölin-Soria [20]. Using Antonov's observation, the following holds. For any $t > 0$, we have

(5.4)  $(\mathcal{C}_{lac} f)^*(t) \lesssim \dfrac{\|f\|_1}{t} \log\log_+\Bigl(\dfrac{t}{\|f\|_1}\Bigr), \qquad 0 \le |f| \le 1$.

(5.5)  $(\mathcal{C}_{full} f)^*(t) \lesssim \dfrac{\|f\|_1}{t} \log_+\Bigl(\dfrac{t}{\|f\|_1}\Bigr), \qquad 0 \le |f| \le 1$.

6. Proof of the Weak-Type Estimate (1.5) 

Antonov [1] used (5.5) to derive his conclusion that $\mathcal{C}_{full}$ maps $L \log L \log\log\log_+ L$ into $L^{1,\infty}$, which remains the best known result for the full sequence of integers. His argument was further generalized by Arias-de-Reyna [2], who phrased it in the language of interpolation and extrapolation theory. The latter approach has been revisited by others, with the relevant point for us being that the starting point is the distributional inequality (5.5), or more generally something of the broad form of (5.5) or (5.4). In our setting, we are fortunate that the investigations of Carro-Martín [4] are nicely suited to derive our weak-type estimate.

We will be a little brief about this, as the Theorems of Carro-Martín apply in an uncomplicated fashion. The extrapolation theory of Carro-Martín starts with a sublinear operator $T$ such that for any $f \in L^1 \cap L^\infty[0,1]$ with $\|f\|_\infty \le 1$ we have, using the notation of (5.1),

(6.1)  $(Tf)^*(t) \le D(\|f\|_1) R(t)$.

Then under mild assumptions on $D$ and $R$, Carro and Martín show that $T$ is bounded from a logarithmic type space $Q_D$ to a weighted Lorentz space $M_R$.

For convenience we shall refer to those functions $f \in L^1 \cap L^\infty[0,1]$ with $\|f\|_\infty \le 1$ as atoms.

6.2. Definition. Let $D : (0, \infty) \to (0, \infty)$ be a concave function such that $D(0+) = 0$. Then $Q_D$ is the space of functions $f$ such that there exists a decomposition of $f$

$$f = \sum_k a_k f_k, \qquad a_k \ge 0,$$

where $\{f_k\}$ are atoms, and a scalar partition of unity $\sum_k b_k = 1$ (with $b_k > 0$) such that the following sum is finite:

k ^ 

The infimum of all such sums is denoted by $\|f\|_{Q_D}$.

6.3. Definition. Let $R : (0, \infty) \to (0, \infty)$. Then $M_R$ is the space of functions $f$ such that

$$\|f\|_{M_R} := \sup_{t > 0} \frac{f^*(t)}{R(t)} < \infty.$$

In particular, when $R(t) = 1/t$ the space $M_R$ becomes the usual $L^{1,\infty}$.

In the following theorems, extracted from [4], we assume that $D$, $R$ are respectively eligible for the above definitions.

6.4. Theorem. [4, Theorem 2.1] Assume that $T$ is a sublinear operator such that for any atom $f$ and any $t > 0$ we have

$$(Tf)^*(t) \le D(\|f\|_1) R(t).$$

Assume that $tR(t)$ is a nondecreasing function. Then $T$ is bounded from $Q_D$ to $M_R$.

Next, the space $Q_D$ and Orlicz spaces are related.

6.5. Definition. Let $D : (0, \infty) \to (0, \infty)$ be a concave increasing function such that $D(0+) = 0$. Then $L \log\log\log_+ L(D)$ is the space of functions $f$ such that

$$\|f\|_{L \log\log\log_+ L(D)} := \int_0^1 f^*(t) \log\log\log_+ \frac{1}{t}\, dD(t) < \infty.$$
t 



6.6. Theorem. [4, Theorem 2.2(3)] Assume $D$ is also increasing. If $D(s) \ge s$ for any $s > 0$, and

D(s^) < sD(s) for any < s < ^ , then 

$$L \log\log\log L(D) \subset Q_D.$$

6.1. Proof of the weak type estimate (1.5). In our case, we'll have (6.1) with

$$D(s) = s \log\log_+\Bigl(\frac{1}{s}\Bigr), \qquad R(t) = \frac{1}{t}.$$

To see this, note that this follows from (5.4) if $t < 1$. When $t \ge 1$, using the trivial bound

$$|\{\mathcal{C} f(x) > s\}| \le 1,$$

which holds for any $s$ (since $\mathcal{C} f$ is supported in $[0,1]$), we obtain $(\mathcal{C} f)^*(t) = 0$, and the factorization estimate (6.1) follows immediately. We note that technically the above function $D(s)$ is not concave for some range of $s$ near 1, so what really happens is we use a concave approximation of $D$ that is comparable (up to the first derivative) to $D$ near these values of $s$. We will abuse notation and use $D$ in the sequel without any further comment.

Now, the extrapolation method of Carro-Martín will give the following estimate:

$$\|\mathcal{C} f\|_{1,\infty} \lesssim \int_0^1 f^*(t) \Bigl( \log\log\log_+ \frac{1}{t} \Bigr) D'(t)\,dt \lesssim \int_0^1 f^*(t) \Bigl( \log\log_+ \frac{1}{t} \Bigr) \Bigl( \log\log\log_+ \frac{1}{t} \Bigr)\,dt \simeq \|f\|_{L(\log\log L)(\log\log\log L)};$$

note that the last equivalence is a known equivalent way to express the Orlicz norm. As it is classical that we have $S_{n_j} f \to f$ a.e. for bounded functions, which are dense in $L(\log\log L)(\log\log\log L)$, this proves the a.e. convergence for all $f$ in this space.

7. Concluding Remarks 

For a lacunary sequence of integers $\{n_j\}$, there is a direct way to see that $S_{n_j} f$ converges to $f$ a.e. for $f \in L(\log L)^{1/2}$. We indicate this here. Letting $V_n f$ denote the de la Vallée Poussin sums, we of course have $V_n f$ converging a.e. to $f$. And, one can see that the inequality below

$$\Bigl\| \Bigl( \sum_j |V_{n_j} f - S_{n_j} f|^2 \Bigr)^{1/2} \Bigr\|_{1,\infty} \lesssim \|f\|_{L(\log L)^{1/2}}$$

is a corollary to the endpoint Marcinkiewicz multiplier theorem of Tao-Wright [21]. This paper has interesting variants of the Zygmund inequality.

If we consider the full sequence of partial Walsh-Fourier sums, we have no better estimate than Hausdorff-Young to use in the multi-frequency argument. We have not seen an estimate that would improve our knowledge of the convergence of the full sequence of Walsh-Fourier sums. Indeed, if we consider any sequence that grows more slowly than lacunary, it would seem that only the Hausdorff-Young inequality is available in the multi-frequency argument.

Konyagin has shown that for special sequences of indices the corresponding partial sums of the Walsh-Paley series converge almost everywhere to $f$ for any $f \in L^1(\mathbb{T})$ [10]. These are sequences of indices such that if we write each index in binary form then there is a uniform bound on the number of times the digits alternate between 0 and 1. In particular, the sequence of powers of 2 falls into this category, although it is not hard to construct a lacunary sequence of integers without this property. He has posed the question of characterizing those sequences of integers $\{n_j\}$ for which the Walsh-Fourier partial sums $S_{n_j} f$ converge pointwise to $f$ for all integrable $f$, see [12, Problem 3.3]. There are more points of interest in this paper; the interested reader is encouraged to read it.